part 0
Model Checking Strategies from Synthesis Over Finite Traces
Bansal, Suguman, Li, Yong, Tabajara, Lucas Martinelli, Vardi, Moshe Y., Wells, Andrew
The innovations in reactive synthesis from {\em Linear Temporal Logics over finite traces} (LTLf) will be amplified by the ability to verify the correctness of the strategies generated by LTLf synthesis tools. This motivates our work on {\em LTLf model checking}. LTLf model checking, however, is not straightforward. The strategies generated by LTLf synthesis may be represented using {\em terminating} transducers or {\em non-terminating} transducers, where executions are of finite-but-unbounded length or infinite length, respectively. For synthesis, there is no evidence that one type of transducer is better than the other, since both exhibit the same complexity and use similar algorithms. In this work, we show that for model checking, the two types of transducers are fundamentally different. Our central result is that LTLf model checking of non-terminating transducers is \emph{exponentially harder} than that of terminating transducers. We show that the problems are EXPSPACE-complete and PSPACE-complete, respectively. Hence, considering the feasibility of verification, LTLf synthesis tools should synthesize terminating transducers. This is, to the best of our knowledge, the \emph{first} evidence in favor of using one type of transducer over the other in LTLf synthesis.
Hyperparameter optimization in Python. Part 0: Introduction.
Hyperparameter optimization, or HPO as the cool kids like to call it, is quickly becoming common knowledge in data science. Anything with "hyper" in the name sounds cool enough, but what does it actually do, and why should you care? Every machine learning algorithm has a number of parameters that you set before you start training. The number of layers, the depth of a tree, and the amount of regularization are just some examples of such (hyper)parameters. Once those are set, you can feed the data to your model, train it, and evaluate its performance. Hyperparameter optimization is simply the process of tweaking hyperparameters to achieve the best performance under some time constraint.
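To make the "tweak, train, evaluate, repeat" loop concrete, here is a minimal sketch of random search under a time budget. Everything in it is an assumption made for illustration: `train_and_evaluate` is a hypothetical stand-in for real model training (it just returns a toy score), and the two hyperparameters, their ranges, and the budget are made up. Random search is only one of many HPO strategies, chosen here because it fits in a few lines.

```python
import random
import time


def train_and_evaluate(depth, regularization):
    """Hypothetical stand-in for training a model and scoring it on validation data.

    The toy score peaks at depth=6 and regularization=0.1; a real version
    would fit a model with these hyperparameters and return its metric.
    """
    return 1.0 - 0.01 * (depth - 6) ** 2 - (regularization - 0.1) ** 2


def random_search(time_budget_s, max_trials, seed=0):
    """Sample hyperparameters at random until the time or trial budget runs out."""
    rng = random.Random(seed)
    deadline = time.monotonic() + time_budget_s
    best_score, best_params = float("-inf"), None
    for _ in range(max_trials):
        if time.monotonic() > deadline:
            break  # respect the time constraint
        params = {
            "depth": rng.randint(1, 12),                # integer in [1, 12]
            "regularization": rng.uniform(0.0, 1.0),    # float in [0, 1]
        }
        score = train_and_evaluate(**params)
        if score > best_score:  # keep the best configuration seen so far
            best_score, best_params = score, params
    return best_params, best_score


best_params, best_score = random_search(time_budget_s=1.0, max_trials=200)
print(best_params, best_score)
```

In a real project the toy function would be replaced by actual training code, and that is where nearly all of the time budget goes.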
My Journey to Reinforcement Learning -- Part 0: Introduction
When we google reinforcement learning, we see images like the one above over and over again. So rather than looking at an agent or an environment, let's think of this as a process in which a baby is learning how to walk: "The 'problem statement' of the example is to walk, where the child is an agent trying to manipulate the environment (which is the surface on which it walks) by taking actions (viz. walking) and he/she tries to go from one state (viz. each step he/she takes) to another. The child gets a reward (let's say chocolate) when he/she accomplishes a submodule of the task (viz. taking a couple of steps) and will not receive any chocolate (a.k.a. negative reward) when he/she is not able to walk. This is a simplified description of a reinforcement learning problem."
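The agent, environment, state, action, and reward in the walking example can be sketched as a small interaction loop. This is only an illustration under stated assumptions: `WalkingEnv`, its `reset`/`step` methods, the goal of five steps, and the two policies are all hypothetical names invented here, and the sketch shows the interaction loop only, not any learning algorithm.

```python
import random


class WalkingEnv:
    """Hypothetical toy environment: the state is the number of steps taken so far."""

    def __init__(self, goal=5):
        self.goal = goal
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action 1 = take a step, action 0 = stand still
        if action == 1:
            self.state += 1
            reward = 1.0  # the "chocolate"
        else:
            reward = 0.0  # no chocolate
        done = self.state >= self.goal
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Let the agent (policy) interact with the environment and sum the rewards."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # agent picks an action
        state, reward, done = env.step(action)  # environment responds
        total_reward += reward
        if done:
            break
    return total_reward


def always_step(state):
    """A policy that always tries to walk."""
    return 1


rng = random.Random(0)


def random_policy(state):
    """A policy that walks or stands still at random."""
    return rng.choice([0, 1])


print(run_episode(WalkingEnv(), always_step))
print(run_episode(WalkingEnv(), random_policy))
```

A learning algorithm would sit on top of this loop and adjust the policy so that the total reward grows over many episodes.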